DPF Operator Installation
Download the local-path-provisioner Helm chart to your current working directory and create a namespace for it:
Jump Node Console
$ curl https://codeload.github.com/rancher/local-path-provisioner/tar.gz/v0.0.30 | tar -xz --strip=3 local-path-provisioner-0.0.30/deploy/chart/local-path-provisioner/
$ kubectl create ns local-path-provisioner
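Because the download is piped straight into tar, a failed download can leave the working directory without a usable chart. The following is a minimal sketch of a check that could be run before the install step. It assumes only that a Helm chart directory contains a Chart.yaml file at its top level; the helper name chart_dir_ok is hypothetical and not part of DPF.

```shell
# Sketch: confirm that a directory looks like an extracted Helm chart.
# chart_dir_ok is a hypothetical helper name used for illustration only.
chart_dir_ok() {
    # A Helm chart directory has a Chart.yaml at its top level.
    [ -f "$1/Chart.yaml" ]
}

# Example use (not executed here):
#   chart_dir_ok ./local-path-provisioner || echo "chart was not extracted" >&2
```

If the check fails, re-running the curl command on its own usually shows whether the download or the extraction is at fault.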
The following values will be used for the installation:
manifests/01-dpf-operator-installation/helm-values/local-path-provisioner.yml
tolerations:
- operator: Exists
  effect: NoSchedule
  key: node-role.kubernetes.io/control-plane
- operator: Exists
  effect: NoSchedule
  key: node-role.kubernetes.io/master
Run the following command:
Jump Node Console
$ helm install -n local-path-provisioner local-path-provisioner ./local-path-provisioner --version 0.0.30 -f ./manifests/01-dpf-operator-installation/helm-values/local-path-provisioner.yml
NAME: local-path-provisioner
LAST DEPLOYED: Tue Jul 8 13:43:06 2025
NAMESPACE: local-path-provisioner
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
...
Ensure that the pod in the local-path-provisioner namespace is in the Ready state:
Jump Node Console
$ kubectl wait --for=condition=ready --namespace local-path-provisioner pods --all
pod/local-path-provisioner-75f649c47c-rsvb8 condition met
The following YAML file defines the storage (for the BFB images) that is required by the DPF Operator.
manifests/01-dpf-operator-installation/nfs-storage-for-bfb-dpf-ga.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: bfb-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  nfs:
    path: /mnt/dpf_share/bfb
    server: $NFS_SERVER_IP
  persistentVolumeReclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bfb-pvc
  namespace: dpf-operator-system
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  volumeMode: Filesystem
  storageClassName: ""
Run the following commands to first create the namespace for the DPF Operator, then substitute the environment variables using envsubst, and apply the YAML files:
Jump Node Console
$ kubectl create namespace dpf-operator-system
$ cat manifests/01-dpf-operator-installation/*.yaml | envsubst | kubectl apply -f -
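envsubst replaces a reference to an unset variable with an empty string, so a missing NFS_SERVER_IP would produce a PersistentVolume with an empty server field rather than an error. The following is a minimal sketch of a pre-flight check that lists variables referenced in the manifests but unset or empty in the environment. The helper name unset_manifest_vars is hypothetical, and the sketch assumes grep supports the -o and -h options and that printenv is available.

```shell
# Sketch: list variables that manifest files reference but that are unset or
# empty in the environment. unset_manifest_vars is a hypothetical helper name.
unset_manifest_vars() {
    # Collect $NAME and ${NAME} references from all files given as arguments.
    grep -ohE '\$\{?[A-Za-z_][A-Za-z0-9_]*\}?' "$@" 2>/dev/null |
        tr -d '${}' | sort -u |
        while read -r name; do
            # printenv only sees exported variables, as envsubst does.
            [ -n "$(printenv "$name")" ] || echo "$name"
        done
}

# Example use (not executed here):
#   missing=$(unset_manifest_vars manifests/01-dpf-operator-installation/*.yaml)
#   [ -z "$missing" ] || { echo "unset variables: $missing" >&2; exit 1; }
```

The check is deliberately conservative: it also reports variables that are set but empty, since those substitute to the same empty string.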
The following table lists all required Helm chart dependencies with their specific versions and purposes:
Helm Chart | Version | Description | Required | Post/Pre-installation
cert-manager | 1.18.1 | Certificate management for Kubernetes; provides automatic TLS certificate issuance and renewal | ✔ | Pre-installation
Argo CD | 7.8.2 | GitOps continuous delivery tool for Kubernetes, necessary for DPUService integration | ✔ | Pre-installation
Node Feature Discovery | 0.17.1 | Discovers and advertises hardware features and capabilities of DPUs in the cluster | ✔ | Pre-installation
Maintenance Operator | 0.2.0 | Manages node maintenance operations and ensures graceful handling of node updates | ✔ | Pre-installation
Kamaji | 1.1.0 | Kubernetes cluster management platform for creating and managing the DPU Kubernetes clusters | ✔ | Pre-installation
All of these components must be installed before the DPF Operator itself can be installed.
We provide a working helmfile configuration that can be used to install all dependencies with the correct values.
The helmfiles are located at deploy/helmfiles/ in the DPF repository.
This approach ensures consistent deployment across different environments and simplifies the installation process.
However, this is provided as a demo option and is not covered by official NVIDIA support.
Run the following commands to install the dependencies:
Jump Node Console
$ wget https://github.com/helmfile/helmfile/releases/download/v1.1.2/helmfile_1.1.2_linux_amd64.tar.gz
$ tar -xvf helmfile_1.1.2_linux_amd64.tar.gz
$ sudo mv ./helmfile /usr/local/bin/
$ helmfile version
$ cd
$ cd doca-platform
$ cd deploy/helmfiles/
$ helmfile init --force
$ helmfile apply -f prereqs.yaml --color --suppress-diff --skip-diff-on-install --concurrency 0 --hide-notes
$ cd
$ cd docs/public/user-guides/hbn_only/
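The commands in this guide assume that kubectl, helm, envsubst, and the newly installed helmfile are all available on PATH. The following is a minimal sketch of a check that reports any that are missing; the helper name missing_tools is hypothetical.

```shell
# Sketch: print the name of every given CLI tool that is not found on PATH.
# missing_tools is a hypothetical helper name used for illustration only.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example use (not executed here):
#   missing_tools kubectl helm helmfile envsubst
```

An empty result means every listed tool was found.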
The DPF Operator Helm values are detailed in the following YAML file:
manifests/01-dpf-operator-installation/helm-values/dpf-operator.yml
kamaji-etcd:
  persistentVolumeClaim:
    storageClassName: local-path
node-feature-discovery:
  worker:
    extraEnvs:
    - name: "KUBERNETES_SERVICE_HOST"
      value: "$TARGETCLUSTER_API_SERVER_HOST"
    - name: "KUBERNETES_SERVICE_PORT"
      value: "$TARGETCLUSTER_API_SERVER_PORT"
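The values above are substituted as plain text, so a malformed TARGETCLUSTER_API_SERVER_PORT would be passed to the node-feature-discovery worker unchanged. The following is a minimal sketch of a sanity check on the port value, assuming it should be a decimal TCP port between 1 and 65535; the helper name valid_port is hypothetical.

```shell
# Sketch: succeed only if the argument is a decimal TCP port (1-65535).
# valid_port is a hypothetical helper name used for illustration only.
valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Example use (not executed here):
#   valid_port "$TARGETCLUSTER_API_SERVER_PORT" || echo "invalid API server port" >&2
```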
Run the following commands to substitute the environment variables and install the DPF Operator:
Jump Node Console
$ helm repo add --force-update dpf-repository ${REGISTRY}
$ helm repo update
$ envsubst < ./manifests/01-dpf-operator-installation/helm-values/dpf-operator.yml | helm upgrade --install -n dpf-operator-system dpf-operator dpf-repository/dpf-operator --version=$TAG --values -
For development purposes, if $REGISTRY is an OCI registry, use this command instead:
$ envsubst < ./manifests/01-dpf-operator-installation/helm-values/dpf-operator.yml | helm upgrade --install -n dpf-operator-system dpf-operator $REGISTRY/dpf-operator --version=$TAG --values -
Release "dpf-operator" does not exist. Installing it now.
coalesce.go:286: warning: cannot overwrite table with non table for dpf-operator.parca.server.tolerations (map[])
NAME: dpf-operator
LAST DEPLOYED: Tue May 20 23:18:22 2025
NAMESPACE: dpf-operator-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
Verify the DPF Operator installation by ensuring the deployment is available and all the pods are ready:
The following verification commands may need to be run multiple times to ensure the conditions are met.
Jump Node Console
## Ensure the DPF Operator deployment is available.
$ kubectl rollout status deployment --namespace dpf-operator-system dpf-operator-controller-manager
deployment "dpf-operator-controller-manager" successfully rolled out
## Ensure all pods in the DPF Operator system are ready.
$ kubectl wait --for=condition=ready --namespace dpf-operator-system pods --all
pod/dpf-operator-argocd-application-controller-0 condition met
pod/dpf-operator-argocd-redis-5bc74d76fc-v6l7m condition met
pod/dpf-operator-argocd-repo-server-86c9454fc9-zqtqf condition met
pod/dpf-operator-argocd-server-554d9f446-lntpv condition met
pod/dpf-operator-controller-manager-67599cdcb7-5dchf condition met
pod/dpf-operator-kamaji-6dcf4ccdfd-fg64w condition met
pod/dpf-operator-kamaji-etcd-0 condition met
pod/dpf-operator-kamaji-etcd-1 condition met
pod/dpf-operator-kamaji-etcd-2 condition met
pod/dpf-operator-maintenance-operator-666b88bfcd-p72nn condition met
pod/dpf-operator-node-feature-discovery-gc-656b95dc48-gwtsb condition met
pod/dpf-operator-node-feature-discovery-master-76d5695c7c-6kwfz condition met
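Since the verification commands may need to be run several times, they can be wrapped in a small retry loop. The following is a minimal sketch; the helper name retry and the attempt and delay values in the example are assumptions, not part of DPF.

```shell
# Sketch: re-run a command until it succeeds or the attempts are exhausted.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
# retry is a hypothetical helper name used for illustration only.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        # Give up once the command has failed the allowed number of times.
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example use (not executed here):
#   retry 10 30 kubectl wait --for=condition=ready \
#       --namespace dpf-operator-system pods --all --timeout=60s
```

kubectl wait also accepts a --timeout flag of its own, so a single longer wait is an alternative to looping.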